NSF PAR Search | NSF Public Access Repository

Estimation of genetic admixture proportions via haplotypes

https://doi.org/10.1016/j.csbj.2024.11.043

Ko, Seyoon; Sobel, Eric M; Zhou, Hua; Lange, Kenneth (December 2024, Computational and Structural Biotechnology Journal)

Full Text Available
Multivariate genome-wide association analysis by iterative hard thresholding

https://doi.org/10.1093/bioinformatics/btad193

Chu, Benjamin B; Ko, Seyoon; Zhou, Jin J; Jensen, Aubrey; Zhou, Hua; Sinsheimer, Janet S; Lange, Kenneth (April 2023, Bioinformatics)
Marschall, Tobias (Ed.)
Abstract. Motivation: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive. Results: We present a sparsity constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA’s linear mixed models and mv-PLINK’s canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits. Availability and implementation: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.
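For readers unfamiliar with the technique named in the abstract above, the following is a minimal illustrative sketch of iterative hard thresholding for a single trait, written in plain Python. It is not the MendelIHT.jl API and not the authors' multivariate algorithm. The function names `hard_threshold` and `iht`, the fixed step size, and the fixed iteration count are all assumptions made for illustration. Each iteration takes a gradient step on the least-squares loss and then keeps only the `k` largest coefficients in magnitude.

```python
def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and zero out the rest."""
    keep = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)[:k]
    out = [0.0] * len(beta)
    for j in keep:
        out[j] = beta[j]
    return out


def iht(X, y, k, step, iters=200):
    """Single-trait iterative hard thresholding (illustrative sketch only).

    X is a list of n rows with p entries each, y is a list of n responses.
    Assumes a fixed step size, which should be small relative to the largest
    eigenvalue of X^T X; practical implementations choose the step adaptively.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Residuals r = y - X beta under the current sparse estimate.
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # X^T r is the negative gradient of the loss 0.5 * ||y - X beta||^2.
        neg_grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step, then projection onto vectors with at most k nonzeros.
        beta = hard_threshold([beta[j] + step * neg_grad[j] for j in range(p)], k)
    return beta


if __name__ == "__main__":
    # Toy design with orthonormal columns, so a step size of 1.0 is safe here.
    X = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    y = [3.0, 0.1, -2.0, 0.05]
    print(iht(X, y, k=2, step=1.0, iters=10))
```

The sparsity level `k` plays the role of the model-size constraint. In this sketch it is supplied by the caller, whereas in practice it would typically be tuned, for example by cross-validation.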
Full Text Available
Unsupervised discovery of ancestry-informative markers and genetic admixture proportions in biobank-scale datasets

https://doi.org/10.1016/j.ajhg.2022.12.008

Ko, Seyoon; Chu, Benjamin B.; Peterson, Daniel; Okenwa, Chidera; Papp, Jeanette C.; Alexander, David H.; Sobel, Eric M.; Zhou, Hua; Lange, Kenneth L. (February 2023, The American Journal of Human Genetics)

Full Text Available
VCSEL: Prioritizing SNP-set by penalized variance component selection

https://doi.org/10.1214/21-AOAS1491

Kim, Juhyun; Shen, Judong; Wang, Anran; Mehrotra, Devan V.; Ko, Seyoon; Zhou, Jin J.; Zhou, Hua (December 2021, The Annals of Applied Statistics)

Full Text Available
Systematic Heritability and Heritability Enrichment Analysis for Diabetes Complications in UK Biobank and ACCORD Studies

https://doi.org/10.2337/db21-0839

Kim, Juhyun; Jensen, Aubrey; Ko, Seyoon; Raghavan, Sridharan; Phillips, Lawrence S.; Hung, Adriana; Sun, Yan; Zhou, Hua; Reaven, Peter; Zhou, Jin J. (February 2022, Diabetes)

Diabetes-related complications reflect longstanding damage to small and large vessels throughout the body. In addition to the duration of diabetes and poor glycemic control, genetic factors are important contributors to the variability in the development of vascular complications. Early heritability studies found strong familial clustering of both macrovascular and microvascular complications. However, they were limited by small sample sizes and large phenotypic heterogeneity, leading to less accurate estimates. We take advantage of two independent studies—UK Biobank and the Action to Control Cardiovascular Risk in Diabetes trial—to survey the single nucleotide polymorphism heritability for diabetes microvascular (diabetic kidney disease and diabetic retinopathy) and macrovascular (cardiovascular events) complications. Heritability for diabetic kidney disease was estimated at 29%. The heritability estimate for microalbuminuria ranged from 24 to 60% and was 41% for macroalbuminuria. Heritability estimates of diabetic retinopathy ranged from 6 to 33%, depending on the phenotype definition. More severe diabetes retinopathy possessed higher genetic contributions. We show, for the first time, that rare variants account for much of the heritability of diabetic retinopathy. This study suggests that a large portion of the genetic risk of diabetes complications is yet to be discovered and emphasizes the need for additional genetic studies of diabetes complications.
Full Text Available
GWAS of longitudinal trajectories at biobank scale

https://doi.org/10.1016/j.ajhg.2022.01.018

Ko, Seyoon; German, Christopher A.; Jensen, Aubrey; Shen, Judong; Wang, Anran; Mehrotra, Devan V.; Sun, Yan V.; Sinsheimer, Janet S.; Zhou, Hua; Zhou, Jin J. (March 2022, The American Journal of Human Genetics)

Full Text Available
A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl

https://doi.org/10.1093/bioinformatics/btab489

Chu, Benjamin B; Sobel, Eric M; Wasiolek, Rory; Ko, Seyoon; Sinsheimer, Janet S; Zhou, Hua; Lange, Kenneth (July 2021, Bioinformatics)
Kelso, Janet (Ed.)
Abstract. Motivation: Current methods for genotype imputation and phasing exploit the volume of data in haplotype reference panels and rely on hidden Markov models (HMMs). Existing programs all have essentially the same imputation accuracy, are computationally intensive and generally require prephasing the typed markers. Results: We introduce a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for HMM calculations. This strategy, embodied in our Julia program MendelImpute.jl, avoids explicit assumptions about recombination and population structure while delivering similar prediction accuracy, better memory usage and an order of magnitude or better run-times compared to the fastest competing method. MendelImpute operates on both dosage data and unphased genotype data and simultaneously imputes missing genotypes and phase at both the typed and untyped SNPs (single nucleotide polymorphisms). Finally, MendelImpute naturally extends to global and local ancestry estimation and lends itself to new strategies for data compression and hence faster data transport and sharing. Availability and implementation: Software, documentation and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelImpute.jl. Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
